Make a workaround in case the Bert model does not have a safetensors file #515
Conversation
Signed-off-by: Liu, Kaixuan <[email protected]>
Diff under review, wrapping the existing FlashBert path in a try/except:

-    return FlashBert(model_path, device, datatype)
+    try:
+        return FlashBert(model_path, device, datatype)
+    except:
Maybe we can catch the specific error linked to no safetensors file rather than catching everything?
I think the main purpose is to get the model running smoothly first, so no matter whether the error is caused by a missing safetensors file or something else, we can fall back to the DefaultModel path as the backup choice. WDYT?
The problem here is that we risk missing completely unrelated errors, which is not good. Thinking about it more, I don't get why DefaultModel would work in this case and not FlashBert. Can you send a command to reproduce it and the logs of the error, please?
The DefaultModel path uses transformers modeling, while FlashBert implements the modeling itself and reads the weight file from safetensors (L305). Some models do not ship this file, which causes the error.

Command line to reproduce:

docker run -p 8080:80 -v $volume:/data --network host --runtime=habana -e HABANA_VISIBLE_DEVICES=all -e MAX_WARMUP_SEQUENCE_LENGTH=512 -e OMPI_MCA_btl_vader_single_copy_mechanism=none --cap-add=sys_nice --ipc=host tei_hpu --model-id BAAI/bge-base-zh-v1.5 --dtype bfloat16 --auto-truncate --pooling cls
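For illustration, a minimal sketch of the presence check this boils down to: FlashBert loads its weights directly from a safetensors file, so a repository such as BAAI/bge-base-zh-v1.5 that only ships pytorch_model.bin leaves it nothing to open. The helper name, the *.safetensors glob, and the commented fallback are assumptions for illustration, not the repository's actual code.

```python
from pathlib import Path


def has_safetensors_weights(model_path: str) -> bool:
    """Return True if the downloaded model directory contains safetensors weights."""
    # Models such as BAAI/bge-base-zh-v1.5 only provide pytorch_model.bin,
    # so this check returns False for them.
    return any(Path(model_path).glob("*.safetensors"))


# Hypothetical usage when picking a modeling path:
# if has_safetensors_weights(model_path):
#     model = FlashBert(model_path, device, datatype)     # custom modeling, safetensors only
# else:
#     model = DefaultModel(model_path, device, datatype)  # transformers modeling, can read .bin
```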
I have updated the try ... except ... part.
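The updated try ... except ... itself is not shown in this excerpt; below is a rough sketch of the narrower handling the review asked for. The FileNotFoundError type, the logger, the create_model wrapper, and the DefaultModel call signature are assumptions, with FlashBert and DefaultModel taken as the classes named in the diff above.

```python
import logging

logger = logging.getLogger(__name__)


def create_model(model_path, device, datatype):
    # Hypothetical factory mirroring the diff: prefer FlashBert, but only fall
    # back to DefaultModel when the failure looks like missing weight files,
    # so completely unrelated errors still surface.
    try:
        return FlashBert(model_path, device, datatype)
    except FileNotFoundError as err:  # assumption: a missing model.safetensors surfaces as this
        logger.warning(
            "FlashBert could not load safetensors weights (%s); falling back to DefaultModel",
            err,
        )
        return DefaultModel(model_path, device, datatype)
```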
Signed-off-by: Liu, Kaixuan <[email protected]>
@regisss Hi, I have updated the code, please have a review.
LGTM, thanks for this contribution. Not using
@regisss @Narsil please help review.